


1 Introduction 
1.1 Proteomics 

The field of proteomics has gained importance over the last ten years. After the
human genome was sequenced in 2000, the challenge has been to interpret
this large amount of information for improving health care and discovering new
drugs. This is where proteomics enters the arena.

The term proteome refers to all proteins produced by a living organism, much
as the genome is the entire set of genes in an organism. Proteomics is the
systematic analysis of the protein profiles expressed by a genome.
It was not until 1995 that the actual term proteomics was introduced.

When examining the genome, one looks at static information that does
not change over time. This makes it difficult to understand which role a
certain gene plays in a dynamic process such as living. The proteome, on
the other hand, varies over time and is defined as the proteins present in a
sample at a certain point in time. The proteome therefore makes it possible
to indicate which part of the genome is active in a process. This is why
proteomics parallels the related field of genomics.

Proteome research is far more complex than genome research, because the
proteins in a cell cannot be amplified, their presence is dynamic and their
solubility is variable. To overcome this complexity and still get useful
information from the analysis, proteome research is divided into three major
steps. They are the following:

1. Separation of individual proteins by two-dimensional polyacrylamide gel
electrophoresis.

2. Identification by mass spectrometry or N-terminal sequencing of individual
proteins recovered from the gel.

3. Storage, manipulation, and comparison of the data using 

Naturally, every step mentioned above is equally important, but for the further
understanding of this master's thesis only the first step is of interest. This part
will be briefly introduced below. For more details about proteomics please refer
to Jain [1].



1.2 Two-Dimensional Gel Electrophoresis

About 25 years ago, two-dimensional gel electrophoresis (2-DE) was introduced
and described by O'Farrell [2] and Klose [3]. Since then, 2-DE has been
used in a diverse range of applications where separation of proteins is essential.
In the beginning the technique was rough and it had quite a few pitfalls and
difficulties. Over the past years these problems have been partly eliminated by
new, improved 2-DE techniques and software, but they still exist. Despite this,
the importance of 2-DE in proteomics has grown, and it is still unparalleled
in its ability to separate and array complex proteins.

Before the 2-DE separation technique can be applied, a protein sample has to
be extracted from the examined organism. The sample has to be pure and free
of contaminating substances, otherwise the separation will be disturbed
or even fail. Numerous techniques exist to purify the proteins, and today
this is not problematic, see Berkelman [4].
After the protein sample has been received, it is placed on a so-called strip.
The strip is made of polyacrylamide gel, contains a pH gradient and is about ten
centimeters long and one centimeter wide. Because of the pH gradient, the
proteins separate according to their isoelectric points over a period of about
ten hours. When this is done, the strip with the one-dimensionally separated
proteins is transferred to a second dimension, where the proteins are separated
according to their molecular weights. The transfer is done by placing the strip
on the side of a plate consisting of sodium dodecyl sulfate-polyacrylamide gel.
An electric field is applied over the plate and the strip, forcing the proteins to
migrate into the plate at different speeds depending on their molecular weight.
In this way the proteins are separated both in a pH dimension and in a molecular
weight dimension. For a schematic view of the separation process, see Figure 1
below.



[Figure panel label: isoelectric focusing]

Figure 1: Schematic view of the 2-DE separation

When the separation of the proteins has been completed, the gel has to be
stained. This makes the non-colored separated proteins visible to the naked
eye. There exist several different staining techniques and substances, which all
have their advantages and disadvantages. The common way to stain gels is with
a silver solution that colors proteins black, see Berkelman [4].

Finally, the stained gel is inserted into a gel scanner and typically converted
to a gray scale digitised TIFF image. This image is fed to gel analysis software
on a computer and further analyzed. For more details about 2-DE techniques and
their application please refer to Ong [5].



1.3 Master's Thesis Problem Definition 

Digitised images received from 2-DE protein separations are very complex.
Often many different types of protein exist in a protein sample, which in turn
yields many different protein spots in the image. Typically between 1000 and
3000 protein spots are present and visible in an image.

If a scientist wants to know which types and amounts of proteins a sample
contains, the 2-DE image has to be evaluated. To do this manually is very
time consuming and practically infeasible. It is estimated that one image
requires approximately five hours or more of manual work. In a normal scientific
study, with several protein samples, as many as 50 images have to be evaluated.
Thus, manual work is impossible if research is to be fast and cost effective.

In recent years computer software has been developed to minimize the
evaluation time. Even though many software packages exist today, quite a few of
them are not very reliable and still require many hours of manual work.

1.4 The Goal of This Master's Thesis

The goal of this master's thesis is to investigate the possibility of segmenting
protein spots from 2-DE images with the help of evolving interfaces. The goal is
also to automate the segmentation process as much as possible, while maintaining
reliability, accuracy and speed.

1.5 Organization of This Report 

The material in this report is divided into ten chapters. In chapter two a closer
look at the images to be segmented is taken. Problems, certain types
and examples of images are shown and clarified. Moving on to chapter
three, previous work done in 2-DE segmentation is discussed and two examples
of well established approaches are given. The theory of image processing is
introduced in chapter four. Chapter five explains the methodology of evolving
interfaces with the numerical approximation scheme, the Fast Marching Method.
In chapter six the implementation of the segmentation system is described. A
comparison and evaluation of the proposed segmentation system is conducted in
chapter seven. The implementation proposal in this master's thesis is far from
complete and suggestions on further improvements are given in chapter eight.
Chapter nine concludes this master's thesis and finally chapter ten acknowledges
the persons involved.
2 A Closer Look at 2-DE Images

[...] technique was briefly described. The procedure produces a result
consisting of two parts. The first is the [...] for example, when it is
interesting to do an identification by mass spectrometry or N-terminal
sequencing of individual proteins. Recall the steps after 2-DE analysis
introduced in Section 1.1. The other part is the digitised image produced
with a gel scanner. This image is further [...] two following. Both the gel
plate and the 2-DE image are closely connected during a full investigation of
the proteome.

2.1 Good Quality 2-DE Images 

[...] it is of utmost importance that they are of good quality. [...]

1. Clear and focused images. 

2. Smooth Images without noise. 

3. Images with as little background variation as possible.

4. Non-saturated images.

5. Evenly spread and well separated proteins throughout the image.

6. Images free from artifacts, such as non-protein patterns.

The quality of the images depends on the protein separation steps, the staining
technique and the scanning. In each of these steps [...] which decrease the
possibilities to end up with a good image. [...] only a very skilled and
experienced 2-DE analyzer [...].

2.2 Examples of 2-DE Images 

A typical 2-DE image is shown in Figure 2. It contains approximately [...]
different proteins. In the image some basic features are marked. These are:

• Black spots: Areas which have a high concentration of a certain protein.
The spots can vary in size and shape, but they are commonly elliptic or close
to circular.

• Overlapping Spots: When two protein spots are too close to each other
they overlap. This means that their areas of distribution [...]

• Background: Area in the image which does not contain any protein spots.

• Varying Background: Intensity variations of the background.

• Artifact: Object which appears in the image and is not a protein spot.




Figure 2: A typical 2-DE image. 

Two low quality images are illustrated in Figure 3 and Figure 4. The first
image is saturated. A saturated image is often the result when too much staining
substance has been used in proportion to the amount of proteins contained in
the sample. Protein spots will be cut off and their shape in three dimensions
will not be true. This may create difficulties when quantifying proteins in such
an image.




Figure 3: A saturated 2-DE image. 

The next image, Figure 4, is of poor quality because the introduced sample
contains too much protein. Long stripes exist in the image and cover
other interesting proteins. A correct evaluation is difficult to perform. The
separation failed when the first dimension was transferred to the second
dimension.

Figure 4: An overloaded 2-DE image.



Over the last couple of years, since the proteomics area began, new improved
2-DE techniques have been developed. Therefore, today it is possible to require
good quality images to guarantee correct evaluation results from a segmentation.
3 Previous Work

3.1 Introduction 

The research in computer-assisted analysis of two-dimensional gel
electrophoresis began about twenty years ago. Mathematical methods for the
image processing were based on results obtained in fields such as image
processing, computer vision and artificial intelligence. Today, there is a big
race between several big companies in the 2-DE software business. Yet, none of
these companies has won the market, and no standards exist in 2-DE image
analysis.

In parallel with all these companies, there are several different approaches to
performing 2-DE image segmentation. When the development of these approaches
began, computer performance was a main issue. Since that time, computers
have become tens and hundreds of times more powerful and today the main
focus lies on correctness, automation and reliability.

In this section two of the well known approaches to the segmentation will be
briefly introduced. For a more complete presentation of these and other
methods in use, see Pedersen [6]. It is often a combination of different
approaches that leads to good and well performing 2-DE image segmentation.

3.2 Watershed Approach 

In the Watershed approach the 2-DE image is regarded as a landscape with hills
and valleys. The gray level value determines the height of the landscape. The
goal for the Watershed is to divide each valley into a separate region, so that
the whole image is divided into a mosaic. In Figure 5, the concept of the
Watershed in one dimension is shown.



Figure 5: Two figures explaining the watershed concept in one dimension. In
each local minimum the water enters and fills each catchment basin. When
water from two different basins meets, a dam wall is built.

To create this subdivision the Watershed uses a technique, which could be
[...]. In the local minima holes are drilled, so that water can flow into the
valleys. Next, the whole image with the holes is lowered into a lake of water.
As it is lowered, water will start to flow through the holes and fill the
landscape. When water from two different regions meets, a dam wall is built to
prevent the water from mixing. As the water level in the landscape rises, more
and more walls are raised. Finally, when the whole landscape has been immersed
into the lake, a complete subdivision of the image has been formed.
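
The flooding idea can be illustrated on a one-dimensional landscape. The sketch below is not the algorithm of any of the cited works; it is a minimal illustration that assumes a list of gray level heights, ignores the special handling that plateaus of equal height would need, and uses the hypothetical name `watershed_1d`.

```python
def watershed_1d(heights):
    """Label the catchment basins of a 1-D landscape by simulated flooding.

    Returns one label per position: 1, 2, ... for basins and 0 where a dam
    is built between two different basins. Plateaus are not treated
    specially (a simplifying assumption of this sketch).
    """
    n = len(heights)
    labels = [None] * n  # None means "not yet reached by the water"
    next_label = 1
    # Rising water level: visit positions from lowest to highest.
    for i in sorted(range(n), key=lambda k: heights[k]):
        flooded = {labels[j] for j in (i - 1, i + 1)
                   if 0 <= j < n and labels[j] not in (None, 0)}
        if not flooded:
            # No flooded neighbour: a new local minimum, so a hole is drilled.
            labels[i] = next_label
            next_label += 1
        elif len(flooded) == 1:
            # Water from exactly one basin reaches this position.
            labels[i] = flooded.pop()
        else:
            # Water from two different basins meets: build a dam.
            labels[i] = 0
    return labels
```

For the landscape `[3, 1, 2, 5, 2, 0, 4]` the two valleys become two basins separated by a dam at the peak of height 5.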

The segmentation problem is now transformed into finding local minima in
the image and deciding which regions in the mosaic image are connected
to protein spots.

The main disadvantage of the Watershed approach is the tendency towards
over-segmentation, due to noise in images. This can be avoided by Scale Space
Watershed and Marker Controlled Watershed. For more details about the Watershed
approach see Vincent [7], Pedersen [6] and Wallmark [8].

3.3 Spot Modelling Approach 

The assumption in this approach is that the protein spots in an image have
some common characteristics that can be captured by a model. The idea is to
find parameters that can change the model so that it fits different protein
spots. A model $C(x, y, \theta)$ is defined, where $x$ and $y$ denote the
position of the model and $\theta$ its parameters. The goal is to fit the model
to the image $I(x, y)$, so that the error is minimised. This can be expressed as

\[ \hat{\theta} = \arg\min_{\theta} \sum_{(x, y) \in w} \bigl( I(x, y) - C(x, y, \theta) \bigr)^2 \tag{1} \]



where $(x, y) \in w$ and $w$ is a region in the image $I(x, y)$. The region $w$
has to be big enough to contain a spot, but small enough to avoid containing
multiple spots. A model commonly used is the Gaussian model given by

\[ C(x, y, \theta) = B + c \, e^{-\frac{(x - x_0)^2}{2\sigma_x^2}} \, e^{-\frac{(y - y_0)^2}{2\sigma_y^2}} \tag{2} \]

This model can be modified with a diffusion equation, but this will not be
further investigated here.

An advantage of this approach is that it does not depend on predefined
markers to find protein spots in the image. It can also model saturated spots
and isolated spots well. Difficulties arise when spots overlap, which they do
very often in 2-DE images. Instead of creating a model for two small spots, one
big spot is selected as the best description, because the model is not complex
enough to model two small spots.

For further reading please consult Pedersen [6] and Bettens [9].
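
To make the least-squares idea concrete, the sketch below evaluates a Gaussian spot model over a region and picks the best of a few candidate parameter sets. It is only an illustration under stated assumptions: the parameter layout `(B, c, x0, y0, sx, sy)` (background, amplitude, centre and widths), the brute-force search over candidates, and all function names are introduced here and are not taken from the cited works.

```python
import math


def gaussian_spot(x, y, theta):
    """Gaussian spot model; theta = (B, c, x0, y0, sx, sy) is an assumed layout."""
    B, c, x0, y0, sx, sy = theta
    return B + c * (math.exp(-(x - x0) ** 2 / (2 * sx ** 2))
                    * math.exp(-(y - y0) ** 2 / (2 * sy ** 2)))


def squared_error(image, region, theta):
    """Sum of squared differences between image and model over the region w."""
    return sum((image[y][x] - gaussian_spot(x, y, theta)) ** 2
               for (x, y) in region)


def best_theta(image, region, candidates):
    """Return the candidate parameter set with the smallest squared error."""
    return min(candidates, key=lambda th: squared_error(image, region, th))


# Synthetic 5 x 5 image generated from a known parameter set.
theta_true = (10.0, 50.0, 2.0, 2.0, 1.0, 1.5)
image = [[gaussian_spot(x, y, theta_true) for x in range(5)] for y in range(5)]
region = [(x, y) for y in range(5) for x in range(5)]
candidates = [(10.0, 50.0, 1.0, 2.0, 1.0, 1.5),
              theta_true,
              (0.0, 50.0, 2.0, 2.0, 1.0, 1.5)]
fitted = best_theta(image, region, candidates)
```

A practical implementation would replace the candidate list with a numerical optimiser; the exhaustive comparison is used here only to keep the example self-contained.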
4 Theory of Image Processing

4.1 Introduction 

The mathematical branch that deals with digital images is called image
processing. The interest in digital images began in the 1920s, when an image
transmitting system was built between London and New York. Ever since, this
scientific field has grown tremendously and today it occupies researchers all over
the world.

There exist some fundamental steps in image processing that are used by
a common image processing system. The first step is the image acquisition.
Digital monochrome images are represented by a light intensity function
$f(x, y)$, where $x$ and $y$ denote spatial coordinates. The value of $f$ at any
point $(x, y)$ is proportional to the brightness of the image at that point. A
digital image can also represent a color image. There are several different ways
of representing color images digitally and one of them is the RGB system.
Instead of having a single-valued function $f(x, y)$ for a monochrome image, as
above, the function becomes three-valued. In each point $(x, y)$ a value is
given for the redness, greenness and blueness in the image.

Next, after the digitised image has been obtained, the image preprocessing
stage is conducted. In this stage the image is improved to increase the chances
of success in the following stages. After the preprocessing stage follows the
image analysis. Here, the image could be segmented, represented, described,
recognized and interpreted.

Below, some techniques for image preprocessing and image segmentation will
be further discussed. A part on image modelling will also be introduced. The
interested reader is referred to Gonzalez [10] for further reading about the
other parts of image processing.

4.2 Image Preprocessing 

To improve the chances of success in a following image processing stage, some
image preprocessing is done. Techniques to remove noise in images will be
introduced below.

To remove noise in images a filter of some kind can be used. Filtering the
image is the same as running a mask through the image. A mask could be
treated as a function given by

\[ g(x, y) = T[f(x, y)] \]

where $f(x, y)$ is the input image, $g(x, y)$ is the processed image, and $T$ is
an operator on $f$, defined over some neighborhood of $(x, y)$. Normally, the
neighborhood about $(x, y)$ is defined by a square or rectangular subimage area
centered at $(x, y)$, as Figure 6 shows. The mask has been run through an image
when the operator $T$ has visited all points $(x, y)$ in the image and thus
generated a new modified image $g(x, y)$.

Figure 6: A three by three neighborhood about a point $(x, y)$ in an image.

Noise in images is often present as small discontinuities. This means that
only one of several pixels in a neighborhood has been disturbed by the noise.
If the noise has this property, it can be removed by taking the mean of a small
neighborhood around each pixel. The most commonly used mean filter is by far
the Gaussian filter of the form

\[ G(x, y) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{3} \]

In Figure 7 two different Gaussian filters with different sigma are shown.



Figure 7: Two Gaussian filters. Left: $\sigma = 10$. Right: $\sigma = 100$.

It is also possible to utilize a simple mask, with ones in each position, as
a mean filter. The size of the filter determines the size of the neighborhood
to create the mean value from. An example of such a mean filter is given in
Figure 8.



\[ \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \]

Figure 8: Mean filter with a size of three by three pixels.
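
As an illustration of running a mask through an image, the sketch below applies a three by three mean filter to an image stored as a list of rows. The function name `mean_filter_3x3` and the choice to leave border pixels unchanged are assumptions made for this example; the text does not prescribe a border treatment.

```python
def mean_filter_3x3(image):
    """Replace each interior pixel by the mean of its 3x3 neighbourhood.

    Border pixels are copied unchanged, which is an assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    out = [[float(value) for value in row] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            total = sum(image[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total / 9.0
    return out


# A single disturbed pixel is spread out over its neighbourhood.
noisy = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
smoothed = mean_filter_3x3(noisy)
```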



4.3 Image Segmentation 

An image is said to be segmented when it is subdivided into its constituent
parts or objects. The level to which this subdivision is carried out depends on
the problem in question. Thus, the segmentation should stop when the object
of interest in an image has been isolated from the other non-interesting parts
or objects. This can be written as

\[ A \cap B = \emptyset, \qquad A \cup B = C \]

where $A$ symbolizes the objects of interest, $B$ symbolizes the non-interesting
objects or parts of the image and $C$ denotes the image.

Segmentation algorithms for gray level images are generally based on one of
two fundamental gray value properties. The first category looks for
discontinuities in images based on large or abrupt changes in gray level values.
Algorithms with this property look for isolated points, which differ from their
surroundings, and for edges and lines in images. The second category of
algorithms looks for similarities in the interesting objects and parts of
images. Principal approaches are thresholding, region growing and region
splitting and merging. Both segmentation categories will be discussed briefly
below. For a more complete presentation, see Gonzalez [10].

4.3.1 Detection of Discontinuities 

As mentioned above, the detection of discontinuities is based on abrupt changes
in gray level values or isolated points. There exist three basic types of
discontinuities, which are points, lines and edges. A very common and quite
fast way of finding discontinuities is to run masks through an image.

Different masks are used to detect different discontinuities. For example, the
general mask, shown in Figure 9, can be modified to detect points, edges and
lines.




Figure 9: A general three by three mask. 

In Figure 10, examples of masks used to detect points and lines are shown.
These masks highlight the discontinuities while suppressing other parts of the
image.



\[ \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} \qquad \begin{bmatrix} -1 & -1 & -1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix} \]



Figure 10: Two three by three masks used for detecting isolated points (left)
and horizontal lines (right).
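
A short sketch of how such a mask responds to an isolated point follows. The mask values are the standard point detection mask (centre 8, surrounded by -1); the function name `apply_mask` and the zero response assigned to border pixels are assumptions of this example.

```python
POINT_MASK = [[-1, -1, -1],
              [-1, 8, -1],
              [-1, -1, -1]]


def apply_mask(image, mask):
    """Sum of mask coefficients times gray levels for every interior pixel.

    Border pixels get the response 0 (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = sum(mask[j][i] * image[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out


# Flat background of gray level 10 with one isolated bright pixel.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 50
response = apply_mask(image, POINT_MASK)
```

The response is large at the isolated pixel, zero in perfectly flat areas, and slightly negative next to the point, so thresholding the response isolates the point.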
4.3.2 Gradient and Laplacian of Images

When looking for discontinuities in images, the gradient and the Laplacian of
images become important tools. These properties are also used in many other
situations.

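The gradient of an image $f(x, y)$ at location $(x, y)$ is the vector

\[ \nabla f = \begin{pmatrix} G_x \\ G_y \end{pmatrix} = \begin{pmatrix} \partial f / \partial x \\ \partial f / \partial y \end{pmatrix} \tag{4} \]

An important quantity is the magnitude of the gradient. It is often referred
to simply as the gradient and is given by

\[ |\nabla f| = \sqrt{G_x^2 + G_y^2} \tag{5} \]

In words, this quantity equals the maximum rate of increase of $f(x, y)$ per unit
distance in the direction of $\nabla f$. The gradient is often approximated with
the computationally faster absolute values, according to

\[ |\nabla f| \approx |G_x| + |G_y| \tag{6} \]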

The partial derivatives $\partial f / \partial x$ and $\partial f / \partial y$
can be derived in many different ways. Here again, masks can be used to find
these quantities. One of the common masks used is the Sobel operator. In
Figure 11 the Sobel mask is given for the x and y direction, respectively.

\[ \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} \qquad \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \]

Figure 11: The Sobel masks used to compute the partial derivatives.
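
The following sketch combines the two Sobel masks with the absolute-value approximation of the gradient magnitude. To avoid ambiguity about axis conventions, the masks are named after what they measure (change across rows and change across columns); these names, the function names and the zero value at border pixels are assumptions of this example.

```python
SOBEL_ROWS = [[-1, -2, -1],   # responds to change from top to bottom
              [0, 0, 0],
              [1, 2, 1]]
SOBEL_COLS = [[-1, 0, 1],     # responds to change from left to right
              [-2, 0, 2],
              [-1, 0, 1]]


def mask_response(image, mask, y, x):
    """Response of a 3x3 mask centred at the interior pixel (y, x)."""
    return sum(mask[j][i] * image[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def gradient_magnitude(image):
    """Approximate the gradient magnitude as |G_rows| + |G_cols|.

    Border pixels are set to 0 (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            g_rows = mask_response(image, SOBEL_ROWS, y, x)
            g_cols = mask_response(image, SOBEL_COLS, y, x)
            out[y][x] = abs(g_rows) + abs(g_cols)
    return out


# Vertical edge: dark left half, bright right half.
edge = [[0, 0, 10, 10] for _ in range(4)]
magnitude = gradient_magnitude(edge)
```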
The Laplacian of a two-dimensional function $f(x, y)$ is a second-order
derivative defined as

\[ \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{7} \]

This quantity may also be implemented in numerous ways, as for the gradient
above. A commonly used spatial mask is given in Figure 12. The Laplacian is
used in many ways to detect edges and also to investigate image curvature.



\[ \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} \]

Figure 12: Most frequently used mask to compute the Laplacian.
4.3.3 Detection of Similarities

This field of image segmentation algorithms includes thresholding, region
growing, and region splitting and merging. In this master's thesis only
thresholding will be of interest, and again the interested reader is referred to
Gonzalez [10] for descriptions of the other techniques.

Thresholding is a very common approach to image segmentation. It is
straightforward and easy to use in the standard case. But in more difficult
situations the thresholding technique is tricky and limiting.

Suppose an image has a histogram like the one given in Figure 13. A histogram
is the frequency function $h(t)$ of an image $f(x, y)$, where $t$ denotes the
different gray levels representing the image. The histogram is of an image
containing an object and a background. The object and the background consist of
pixels partitioned into two different gray level groups. Each gray level group
is normally distributed and has a mean value quite different from the other.




Figure 13: Histogram of an image with an object and a background. 

By thresholding this image, the background and object are classified according
to their intensities. This gives

\[ g(x, y) = \begin{cases} 1 & \text{if } f(x, y) > T \\ 0 & \text{if } f(x, y) \le T \end{cases} \tag{8} \]

where pixels labelled 1 correspond to object and pixels labelled 0 correspond
to background. In this case, with a histogram as in Figure 13, the threshold $T$
should be set to 175. Doing so would classify the object and the background with
one hundred percent accuracy.
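
Thresholding and the histogram it relies on can be sketched in a few lines. The function names `histogram` and `threshold` and the small test image are introduced here for illustration only; the threshold value 175 is the one used in the text.

```python
from collections import Counter


def histogram(image):
    """Frequency of each gray level occurring in the image."""
    return Counter(value for row in image for value in row)


def threshold(image, T):
    """Label pixels with gray level above T as object (1), others as background (0)."""
    return [[1 if value > T else 0 for value in row] for row in image]


image = [[100, 200],
         [175, 176]]
labels = threshold(image, 175)
counts = histogram(image)
```

Note that a pixel exactly at the threshold is classified as background, matching the "greater than" convention used above.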

4.3.4 Evolving Interfaces in Image Segmentation 

In recent years a new approach to image segmentation has been developed. It
is based on evolving interfaces, such as Snakes, see Kass, and other active
contours. The idea is to let a front grow from a starting seed, placed inside or
outside the object to be segmented. By doing so, false noise boundaries due to
artifacts can be avoided. In Figure 14 below, an example is given. In the image
a white object pierced with small holes against a black background is shown.
The desired segmentation result should be the larger outer boundary.




Figure 14: Three images showing the difference between thresholding and
evolving interface segmentation. Left: Original image. Center: Segmentation with
thresholding. Right: Evolving interface segmentation.

The two different approaches to image segmentation, shown in the center and
the right image in Figure 14, illustrate the advantage of evolving interfaces
over thresholding. The power of the evolving interface approach is that it
can naturally handle topological changes and has good stopping
criteria, built on the gradient. With the help of the Fast Marching Method,
introduced in Section 5, the evolving interface approach to image segmentation
is a fast and very flexible segmentation method.

Recall that this master's thesis is based on segmentation with the help of
evolving interfaces.



4.3.5 Image Segmentation Criteria

It is very important that the base for image analysis is founded properly. There
exist several desirable properties which the segmentation has to satisfy to be a
solid base to stand on. A well performing segmentation should be

1. Fast. Often the segmentation is one of many steps in a long chain during
an image analysis. Therefore, it should be as fast as possible to reduce
computational time.

2. Repeatable. It should be possible to repeat the same segmentation over
and over again with the same result.

3. Robust. To get a correct and reliable result the segmentation needs to be
robust. The segmentation system also has to function properly even in
unusual situations.

4. Accurate. It is necessary to have an accurate segmentation so that the
segmented objects are as close to reality as possible.

The above stated criteria are sometimes difficult to satisfy all at the same
time. For example, the segmentation is faster if it is less accurate. On the
other hand it becomes less reliable. In common problems the most important
criterion has to be satisfied at the cost of others. But in some applications
all criteria have to be satisfied to have a fully operational segmentation.
These are very difficult segmentation problems and require highly specialized
segmentation algorithms.
4.4 Image Modelling

Sometimes it is interesting to obtain gray level values in images between known
intensity pixels. In the "real world" the objects represented by gray level
values are continuous. In digital images they have been discretized. Thus, to
get a value between a neighborhood of discrete pixels the value has to be
interpolated from the neighbors. This could in some sense be called image
modelling.

There exist several different interpolation algorithms with different prop- 
erties, but here only the bilinear interpolation will be discussed. For other 
interpolation methods, please refer to Gonzalez [10]. 

To calculate the intensity in a point between discrete pixels, the bilinear
interpolation uses the gray level values of the four nearest neighbors to that
point. The intensity in a point $p$ is given by

\[ I(p) = I_1 cd + I_2 ac + I_3 bd + I_4 ab \tag{9} \]

where the intensities of the four known corners, placed in the centers of the
surrounding pixels, are denoted $I_1$ to $I_4$, and $a$, $b$, $c$ and $d$ are
defined as in Figure 15.

Figure 15: Bilinear interpolation.
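
A minimal sketch of bilinear interpolation follows. Since the figure defining the distances is not reproduced here, the code assumes the following labelling: `a` is the horizontal distance from the left pixel centre and `d = 1 - a`, `b` is the vertical distance from the upper pixel centre and `c = 1 - b`, with the corners numbered upper left, upper right, lower left, lower right. Under that assumption the four weights `cd`, `ac`, `bd` and `ab` sum to one. The function name `bilinear` is hypothetical.

```python
def bilinear(image, px, py):
    """Interpolate the gray level at the non-integer position (px, py).

    The point must lie inside the image so that all four neighbours exist;
    pixel centres are assumed to sit at integer coordinates.
    """
    x0, y0 = int(px), int(py)        # upper left neighbour
    a = px - x0                      # horizontal distance to the left centre
    d = 1.0 - a
    b = py - y0                      # vertical distance to the upper centre
    c = 1.0 - b
    i1 = image[y0][x0]               # upper left
    i2 = image[y0][x0 + 1]           # upper right
    i3 = image[y0 + 1][x0]           # lower left
    i4 = image[y0 + 1][x0 + 1]       # lower right
    return i1 * c * d + i2 * a * c + i3 * b * d + i4 * a * b


image = [[0, 10],
         [20, 30]]
centre_value = bilinear(image, 0.5, 0.5)
```

At a pixel centre the interpolation returns the pixel's own value, and midway between four pixels it returns their mean.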
5 Theory of Evolving Interfaces
5.1 Introduction 

Evolving interfaces exist in varying settings in our surroundings. These are,
among many others, ocean waves, material boundaries and burning flames. Even
though it seems strange, handwritten characters, shapes against backgrounds
and iso-intensity contours in images can also be described with moving
interfaces, see Sethian [12].

A boundary is an example of an evolving interface. It can be described as
a curve in two dimensions or a surface in three dimensions. From here on, only
the two-dimensional case is discussed, if nothing else is mentioned. Imagine
that the curve separates the inside from the outside and that it moves outwards
under the known speed function $F$. The curve expands in its normal direction.
Below, in Figure 16, an example of a propagating boundary is shown.




Figure 16: Propagating curve in normal direction with speed F.

The aim with evolving interface problems is to track the above curve as it
evolves. The first thing to do, in an attempt to describe the motion of the
interface, is to find the speed function $F$. In a typical boundary formulation,
the speed function might depend on the following factors:

• Local Properties (L). These properties are controlled by the local
information. They are, for example, normal vector and curvature.

• Global Properties (G). Global properties depend on the shape and position
of the front. For example, the speed also depends on associated differential
equations.

• Independent Properties (I). These are independent of the boundary shape and
are decided by the underlying structure in which the boundary is propagating.
One example is a constant gradient that affects the movement of the interface.

Thus, the speed function $F$ with the above stated properties can be written
as

\[ F = F(L, G, I) \tag{10} \]

The speed function determines the whole interface evolution. Therefore, it
is very important that an adequate speed function is chosen to describe the
boundary formulated problem.

To be able to track interfaces successfully, an Eulerian framework, see
Sethian [12], must be set up. An Eulerian formulation is suitable because the
underlying coordinate system remains fixed and does not vary in time.

5.2 Formulating the Differential Equation 

One way of describing evolving interfaces is with the help of differential
equations. There exist two main options to formulate the interface propagation
as a differential equation. The first option is to let the speed function $F$
always have values greater than zero. Then, the position of the propagating
front can be monitored by calculating its arrival time $T(x, y)$ at the position
$(x, y)$. The equation for the time function is easily found, because
distance = rate * time, which gives the Eikonal differential equation

\[ |\nabla T| F = 1, \qquad T = 0 \text{ on } \Gamma \tag{11} \]

where $\Gamma$ is the initial location of the interface.
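
The step from "distance = rate * time" to the Eikonal equation can be made explicit in one dimension. The short derivation below is a sketch that assumes $F > 0$ and a front moving in the positive x-direction; the increments $dx$ and $dT$ are introduced here for illustration.

```latex
% During the time dT the front at position x advances the distance dx.
\begin{align*}
  dx &= F(x)\, dT
     && \text{(distance = rate $\cdot$ time)} \\
  \frac{dT}{dx} &= \frac{1}{F(x)} \\
  \left| \frac{dT}{dx} \right| F(x) &= 1
     && \text{(one-dimensional form of } |\nabla T| F = 1 \text{)}
\end{align*}
```

In two dimensions the derivative is replaced by the gradient of the arrival time, because the front moves in its normal direction, which is the direction of $\nabla T$.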

Figure 17 below shows the one-dimensional setup of the derived differential
equation.




Figure 17: The one-dimensional setup of the derived differential equation (11).

Now the front motion can be characterized as the solution to a boundary
value problem. If the speed function $F$ depends on the position of the front,
the differential equation becomes a non-linear Eikonal equation.

The second option to formulate the differential equation does not demand
that the speed function $F$ is strictly positive. It is a more general
formulation which allows the front to move past the same position $(x, y)$
several times. When $F$ is negative, the front can move backwards and revisit an
already passed position. Hence, the crossing time function $T(x, y)$ is a
multi-valued function. To describe the motion of the front, a higher-dimensional
function $\phi$, with the front's initial position as its zero level set, must
be introduced. This leads to the initial value formulation

\[ \phi_t + F |\nabla \phi| = 0, \qquad \text{given } \phi(x, t = 0) \tag{12} \]

In this master's thesis only the boundary value formulation will be of interest.
The initial value formulation solved with numerical approximation schemes is
called the Level Set Method. Interested readers are referred to Sethian [12].
Returning to the boundary value formulation again, recall that the interface
evolution will be known when the Eikonal equation (11) has been solved. One
problem in solving these equations is that the solution does not have to be
differentiable, even with smooth initial boundaries. How can the solutions to
these differential equations be found? One way is with the approximating
numerical Fast Marching Method.

5.3 The Fast Marching Method 

The Fast Marching Method is a computational technique which approximates
the solutions to the non-linear Eikonal equation of the form in Eqn. (11).
The method also deals with non-differentiable solutions in a natural way, which
otherwise could cause problems.

5.3.1 Non-Differentiability

To further investigate the solutions that might be non-differentiable, consider
an example of a non-convex initial curve and its differential equation given by

$$|\nabla T| = 1 \qquad (13)$$

The speed function F is in this case constant and equal to one (F = 1).
Suppose it is of interest to know where the interface is positioned after one
unit of time. In Figure 18, the initial curve and its propagation in question
are shown.



Figure 18: Solution to differential equation given in Eqn. (13).

The above result shows that it is possible to end up with a non-differentiable
solution even with a smooth initial boundary. This implies that the
approximation schemes used for the Eikonal equation must be able to deal with
non-differentiable solutions.

5.3.2 Approximation Scheme 

In the Fast Marching Method, an approximation of the gradient using upwind
differences is used. The upwind scheme is based on Euler's method, see
Harris [13], for approximating the derivatives. By solving the ordinary
differential equations outwards along the positive and negative x-axis, in the
one-dimensional case, the approximation is given by

$$\frac{T_{i+1} - T_i}{h} = \frac{1}{F_i}, \quad i \geq 0, \qquad \frac{T_{i-1} - T_i}{h} = \frac{1}{F_i}, \quad i \leq 0 \qquad (14)$$

where T_0 = 0 and h is the grid spacing. Hence, each ordinary differential
equation is solved away from the boundary condition.
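
The one-dimensional marching in Eqn. (14) can be illustrated with a few lines of
Python. This is only a minimal sketch: the function name `arrival_times_1d`, the
list-based grid and the default spacing `h = 1.0` are assumptions made for the
illustration and are not taken from the implemented system.

```python
def arrival_times_1d(speed, source, h=1.0):
    """March the arrival time T outwards from grid index `source`.

    speed  : list of strictly positive speed values F_i
    source : index of the boundary point, where T = 0
    h      : grid spacing
    """
    n = len(speed)
    T = [0.0] * n
    # Positive direction: T_{i+1} = T_i + h / F_i
    for i in range(source, n - 1):
        T[i + 1] = T[i] + h / speed[i]
    # Negative direction: T_{i-1} = T_i + h / F_i
    for i in range(source, 0, -1):
        T[i - 1] = T[i] + h / speed[i]
    return T
```

With a constant speed of one, the computed arrival time is simply the distance
to the source point, which is the behavior expected from distance = rate * time.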

It is possible to approximate the Eikonal equation with an equation of motion
given by

in the one-dimensional case, where S is the speed function, see Sethian [14]. The
spatial derivative can be approximated with a finite difference approximation
built on the upwind scheme, see Osher [16]. This approximation yields

$$|u_x| \approx \left( \max(D_i^{-x} u, 0)^2 + \min(D_i^{+x} u, 0)^2 \right)^{1/2}$$

where the standard notation for the finite differences, given by

$$D_i^{+x} u = \frac{u_{i+1} - u_i}{h}, \qquad D_i^{-x} u = \frac{u_i - u_{i-1}}{h} \qquad (18)$$

has been used. Above, u_i is the value of u on a grid at the point x_i with grid
spacing h.

Now, the Eikonal equation (11) can be written, in the two-dimensional case, as

$$|\nabla T| \approx \left[ \max(D_{ij}^{-x} T, 0)^2 + \min(D_{ij}^{+x} T, 0)^2 + \max(D_{ij}^{-y} T, 0)^2 + \min(D_{ij}^{+y} T, 0)^2 \right]^{1/2} = \frac{1}{F_{ij}}$$

where the forward and backward operators $D^{-y}$ and $D^{+y}$ are similar to the
ones defined for the x-direction in Eqn. (18). A slightly different and more
convenient approximation, see Rouy [17], is given by

$$\left[ \max(D_{ij}^{-x} T, -D_{ij}^{+x} T, 0)^2 + \max(D_{ij}^{-y} T, -D_{ij}^{+y} T, 0)^2 \right]^{1/2} = \frac{1}{F_{ij}} \qquad (19)$$

This equation is a piecewise quadratic equation for T_ij, given that the
neighboring grid values for T are known. With this approximation the
non-differentiability in Eqn. (11) is dealt with in a natural way and it will not
cause any trouble.

The above approximations lead to information propagating from smaller
values of T to larger values. Hence, the Fast Marching Method rests on solving
the above equation by building the solution outwards from the smallest value
of T.

By confining the building zone to a narrow band around the front it is possible
to make the Fast Marching Method very fast, because it does not have to keep
track of every value in the solution at the same time. The key to the speed lies
in how to update the narrow band and how to select which grid point in the
narrow band to update. Selecting the grid point to update is straightforward.
The grid point with the lowest calculated T is the one to update.

5.3.3 Algorithm 

The above conclusions finally lead to the complete Fast Marching Method,
which is, in the two-dimensional case

1. Initialize Given Boundary: Tag all grid points in the initial condition
as Known. Continue by tagging as Trial all unknown four-connected
neighbors to the Known points. Finally, tag all other grid points as Far.

2. Get Next Candidate Point: Let the next candidate grid point be the one
in Trial with the smallest value of T.

3. Update Narrow Band: Remove the candidate point from the set Trial and
tag it as Known. Tag also all unknown four-connected neighbors of the
candidate point as Trial. If a neighbor is of type Far, add it to the narrow
band by removing it from the Far list and adding it to the set Trial.

4. Recompute T Values: Recompute the T values for all four-connected
neighbors of the candidate point in stage 2 by solving Eqn. (19).

5. Loop: While stopping criteria are not reached, go to stage 2.
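
The five stages above can be sketched in Python as follows. This is an
illustrative sketch under stated assumptions, not the implemented system: the
names `fast_march`, `solve_update` and `t_max` are hypothetical, the grid is a
list of lists, the update is the usual first-order two-direction upwind solution
on a grid with spacing `h`, and stale heap entries are skipped instead of being
updated in place.

```python
import heapq
import math

KNOWN, TRIAL, FAR = 0, 1, 2
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))   # four-connected

def solve_update(T, tag, i, j, speed, h):
    """Smallest T at (i, j) consistent with its Known neighbors (assumed scheme)."""
    rows, cols = len(T), len(T[0])

    def known_min(a, b):
        vals = [T[r][c] for r, c in (a, b)
                if 0 <= r < rows and 0 <= c < cols and tag[r][c] == KNOWN]
        return min(vals) if vals else math.inf

    tx = known_min((i - 1, j), (i + 1, j))
    ty = known_min((i, j - 1), (i, j + 1))
    rhs = h / speed[i][j]
    lo, hi = min(tx, ty), max(tx, ty)
    if hi - lo >= rhs:            # only one direction contributes
        return lo + rhs
    # Both directions: (T - tx)^2 + (T - ty)^2 = rhs^2, larger root
    return 0.5 * (lo + hi + math.sqrt(2.0 * rhs * rhs - (hi - lo) ** 2))

def fast_march(speed, seeds, h=1.0, t_max=math.inf):
    rows, cols = len(speed), len(speed[0])
    T = [[math.inf] * cols for _ in range(rows)]
    tag = [[FAR] * cols for _ in range(rows)]
    heap = []

    def unknown_neighbors(i, j):
        for di, dj in NEIGHBORS:
            r, c = i + di, j + dj
            if 0 <= r < rows and 0 <= c < cols and tag[r][c] != KNOWN:
                yield r, c

    def refresh(i, j):
        if speed[i][j] <= 0:      # zero speed: the front never arrives
            return
        t_new = solve_update(T, tag, i, j, speed, h)
        if t_new < T[i][j]:
            T[i][j] = t_new
            tag[i][j] = TRIAL     # Far points enter the narrow band here
            heapq.heappush(heap, (t_new, i, j))

    # Stage 1: initial boundary is Known, its neighbors become Trial
    for i, j in seeds:
        T[i][j] = 0.0
        tag[i][j] = KNOWN
    for i, j in seeds:
        for r, c in unknown_neighbors(i, j):
            refresh(r, c)

    while heap:
        t, i, j = heapq.heappop(heap)          # Stage 2: smallest Trial value
        if tag[i][j] == KNOWN or t > T[i][j]:
            continue                           # stale heap entry, skip it
        if t > t_max:
            break                              # Stage 5: stopping criterion
        tag[i][j] = KNOWN                      # Stage 3: accept the candidate
        for r, c in unknown_neighbors(i, j):
            refresh(r, c)                      # Stage 4: recompute neighbors
    return T
```

A call such as `fast_march([[1.0] * 5 for _ in range(5)], [(2, 2)])` returns the
arrival times for a front starting at the center of a 5 by 5 grid. Points that
are still Trial or Far when `t_max` is reached keep their tentative or infinite
values.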

In Figure 19 an example of the Fast Marching Method is given. The example
is a boundary value formulation with known boundary value at the center
grid point. Black grid points represent Known, gray Trial and transparent Far
values.



Figure 19: An example of a boundary value formulation solved with the Fast
Marching Method. Each image represents a stage in the interface evolution.
The center grid point is the initial boundary. Black grid = Known. Gray grid
= Trial. Transparent grid = Far.



5.3.4 Updating Procedure on Orthogonal Mesh

A very important and central role in the Fast Marching Method is the updating
procedure for T values. This is the procedure by which new trial values are
created for T in the narrow band. By solving Eqn. (19) a new value of T is
obtained. But how does this work in practice?

Solving Eqn. (19) means solving a piecewise quadratic equation for T_ij,
assuming that the neighboring grid values for T are given. In the special case
with an orthogonal mesh, an algebraic solution to Eqn. (19) can be found. In
the following example this will be shown.

Imagine that an orthogonal mesh exists, like the one in Figure 20 below.
Suppose that the goal with the updating procedure is to calculate a new value
for T at the center grid point.



Figure 20: Example of an orthogonal mesh with five grid points.

The surrounding values of T are labelled T_A, T_B, T_C and T_D, one for each
of the four neighbors T_{i±1,j} and T_{i,j±1} of the center point. Standing at
the center point, the Fast Marching Method attempts to solve the quadratic
equation given by each quadrant. In this example only T_A, T_B and T_C act as
contributors. The other end point T_D is regarded as a Far value and does not
have an influence on the solution. Thus, there are three cases to look at when
solving the equation and they are

Only T_A is Known: The real solution with the smallest T so that T > T_A is
given by

$$(T - T_A)^2 = \frac{1}{F_{ij}^2}$$

T_A and T_B are Known: The real solution with the smallest T so that T > T_A
and T > T_B is given by

$$(T - T_A)^2 + (T - T_B)^2 = \frac{1}{F_{ij}^2}$$

T_A, T_B and T_C are Known: The real solution with the smallest T so that
T > T_A, T > T_B and T > T_C is given by

$$(T - T_A)^2 + (T - T_B)^2 + (T - T_C)^2 = \frac{1}{F_{ij}^2}$$
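
As a hedged worked example, assume unit grid spacing and that the two-neighbor
case has the form $(T - T_A)^2 + (T - T_B)^2 = 1/F^2$, where F stands for the
speed at the center point. Under these assumptions the quadratic can be solved
in closed form:

```latex
\begin{align*}
2T^2 - 2(T_A + T_B)\,T + T_A^2 + T_B^2 - \frac{1}{F^2} &= 0 \\
T &= \frac{T_A + T_B}{2} + \frac{1}{2}\sqrt{\frac{2}{F^2} - (T_A - T_B)^2}
\end{align*}
```

The larger root is taken so that T exceeds both neighbors, which requires
$|T_A - T_B| < 1/F$. For instance, T_A = T_B = 1 and F = 1 give
$T = 1 + \sqrt{2}/2 \approx 1.707$. If the condition fails, only the smaller
neighbor contributes and the one-neighbor case gives $T = \min(T_A, T_B) + 1/F$.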






It is easily seen that the number of equations to be solved (M), related to
the number of neighboring grid points with Trial values for T (n), is

5.4 Efficient Sorting for the Fast Marching Method

To keep the Fast Marching Method fast, a good sorting technique must be
utilized. The reason for this is that after every narrow band update, all Trial
values in the narrow band have to be re-sorted. This guarantees that the
smallest value of T will be the next candidate. A heap sorting method is a very
good choice for keeping track of the Trial values.

5.4.1 Heap Sort

Heap sort is an optimal way of sorting information. It does not require any
extra space and has a time complexity of O(n log(n)), which is very fast. The
sorting technique is based on heaps. This data structure is explained in the
following.

A heap is a tree where every node has a key more extreme (greater or less)
than the keys of its children. The tree must be a complete tree, which means
that

1. it is empty or

2. its left subtree is complete of height h - 1 and its right subtree is
completely full of height h - 2 or

3. its left subtree is completely full of height h - 1 and its right subtree is
complete of height h - 1.

Figure 21 below shows an example of a max heap.




Figure 21: Example of a max heap with ten nodes. 

In practice, a binary tree can be represented by a vector. This is a very
efficient way of implementing a heap. The vector representation does not require
any extra space if the tree is complete. By numbering the nodes from top to
bottom and left to right, the children of node i are at 2i and at 2i + 1 if they
exist, and the parent of node i is at ⌊i/2⌋ (integer part of the division) if it
exists. Figure 22 shows a min heap and the equivalent heap vector.

Figure 22: A min heap with six elements and its equivalent array.

To use the heap sorting algorithm in the Fast Marching Method certain
operations on the heap must be supported. These are

1. Remove the smallest value of T in the heap.

2. Add an element to the heap.

3. Update a key value at any given position in the heap.
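
The three operations can be sketched on the vector representation described
above. The class below is a minimal illustration, assuming 1-based indexing so
that the children of node i sit at 2i and 2i + 1. The class name `MinHeap`, its
method names and the position dictionary `pos` are hypothetical choices made for
this example.

```python
class MinHeap:
    """Array-backed min heap; children of node i are at 2i and 2i + 1."""

    def __init__(self):
        self.keys = [None]     # index 0 is unused so that the root sits at 1
        self.items = [None]
        self.pos = {}          # item -> current index in the vector

    def __len__(self):
        return len(self.keys) - 1

    def _swap(self, a, b):
        self.keys[a], self.keys[b] = self.keys[b], self.keys[a]
        self.items[a], self.items[b] = self.items[b], self.items[a]
        self.pos[self.items[a]] = a
        self.pos[self.items[b]] = b

    def _sift_up(self, i):
        # Move a node towards the root while it is smaller than its parent
        while i > 1 and self.keys[i] < self.keys[i // 2]:
            self._swap(i, i // 2)
            i //= 2

    def _sift_down(self, i):
        # Move a node towards the leaves while a child is smaller
        n = len(self)
        while True:
            smallest = i
            for child in (2 * i, 2 * i + 1):
                if child <= n and self.keys[child] < self.keys[smallest]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def push(self, key, item):
        """Operation 2: add an element to the heap."""
        self.keys.append(key)
        self.items.append(item)
        self.pos[item] = len(self)
        self._sift_up(len(self))

    def pop_min(self):
        """Operation 1: remove the smallest key from the heap."""
        last = len(self)
        if last == 0:
            raise IndexError("pop from empty heap")
        self._swap(1, last)
        key, item = self.keys.pop(), self.items.pop()
        del self.pos[item]
        if len(self) > 0:
            self._sift_down(1)
        return key, item

    def update(self, item, new_key):
        """Operation 3: change the key of an element anywhere in the heap."""
        i = self.pos[item]
        self.keys[i] = new_key
        self._sift_up(i)
        self._sift_down(self.pos[item])
```

Each operation touches at most one root-to-leaf path, so its cost grows with the
height of the tree, that is with log(n). Items are assumed to be hashable and
unique, for example grid coordinates.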

These operations should be possible to complete while ensuring that the
heap remains proper. For further information please refer to Jeffrey [15]. It is
also possible to extend the heap property to handle equal key values. The order
of the equal key values is not guaranteed, but in this application it is
irrelevant.

6 A Segmentation System

This section describes the implemented segmentation system applied to 2-DE
images. Recall that segmenting 2-DE images means extracting as many protein
spots as possible from the protein pattern in the 2-DE image. Below, in
Figure 23, an overview of the system is given.



The system uses image preprocessing in an image enhancement stage to
remove background variations and noise. Thereafter, an image analysis stage is
performed. This is where the images are segmented and information about each
segmented protein is stored in a structured way. In each of these stages many
different tricks are used to find the optimal segmentation. The above introduced
stages and substages are described in this section.

6.1 Background Variation Removal 

Images received by the segmentation system almost always have varying
backgrounds. Therefore, it is difficult to see differences between background
and protein spots when segmenting several parts of the images at the same time.
In one part the background might be very bright (high gray level values) and in
such a part the proteins are also very bright. In other parts it is the reverse,
with dark background and dark proteins. Figure 24 shows an example of the
varying background effect.

This property makes it difficult to find similarities between proteins in the 
whole image, without including background pixels. By removing the back- 
ground, a more uniform image is obtained, which makes it easier to separate 
background from proteins. 

When removing the background the optimal result would be

1. Non-fluctuating background or

2. Homogeneous background with significantly different properties than the
interesting proteins.

3. Controlled or no changes to the protein pattern visible in the images.

In this case only a coarse background removal was needed to improve the
image segmentation result.




Figure 23: Overview of Segmentation System.

Figure 24: An example of the effect of varying background. Two different parts
of the same image.



A simple and surprisingly well functioning background removal, based on
bilinear interpolation, was created and used. The idea is based on the
assumption that the background varies with low frequencies and does not have
large discontinuities. This makes it possible to build a model of the background
using bilinear surface blocks. To build the blocks, a bilinear interpolation
method, according to Section 4.4, was used.

The following algorithm to model and subtract the background was used

1. Decide Block Size: The size of building blocks was decided based on the
size of the image. A large image gives large block size.

2. Divide the Image in Parts With the Same Size as the Blocks: At the
fringes the size of the image parts was chosen so that they, together with
the other parts of the image, filled the whole image.

3. Calculate Mean Intensity in Each Part of the Image: In each part, the
mean intensity was calculated and stored in a vector.

4. Take Mean Values to be the Corners of the Modeled Blocks

5. Create Modeled Blocks With Bilinear Interpolation Between the Corners
for Each Block

6. Put the Modeled Blocks in a Background Image

7. Subtract Background Image From Original Image
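
The seven steps can be sketched in plain Python as shown below. This is a
simplified illustration and not the implemented Matlab code: the function names
are hypothetical, the image is a list of lists, the block size is passed in
directly, and the block means are assumed to sit at regularly spaced block
centers, with the background held constant outside the outermost centers.

```python
def block_means(image, block):
    """Steps 2-3: mean intensity of each block; fringe blocks may be smaller."""
    rows, cols = len(image), len(image[0])
    means = []
    for r0 in range(0, rows, block):
        row = []
        for c0 in range(0, cols, block):
            part = [image[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            row.append(sum(part) / len(part))
        means.append(row)
    return means

def bilinear_background(means, rows, cols, block):
    """Steps 4-6: interpolate bilinearly between neighboring block means."""
    def locate(pos, count):
        # Fractional index into the grid of block centers, clamped at the fringes
        g = (pos + 0.5) / block - 0.5
        g = min(max(g, 0.0), count - 1.0)
        lo = min(int(g), max(count - 2, 0))
        return lo, min(lo + 1, count - 1), g - lo

    background = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        r0, r1, fr = locate(r, len(means))
        for c in range(cols):
            c0, c1, fc = locate(c, len(means[0]))
            top = (1 - fc) * means[r0][c0] + fc * means[r0][c1]
            bottom = (1 - fc) * means[r1][c0] + fc * means[r1][c1]
            background[r][c] = (1 - fr) * top + fr * bottom
    return background

def remove_background(image, block):
    """Step 7: subtract the modeled background from the original image."""
    rows, cols = len(image), len(image[0])
    bg = bilinear_background(block_means(image, block), rows, cols, block)
    return [[image[r][c] - bg[r][c] for c in range(cols)] for r in range(rows)]
```

A slowly varying background, such as a linear ramp, is reproduced exactly
between the block centers and therefore vanishes after the subtraction, while
structures much smaller than a block are left largely untouched.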

6.2 Noise Removal 

Similar to background variations in the images, noise in the images is always
present in varying amount. The reason to remove noise in the images is to make
the initial seed selection, discussed in Section 6.3, easier.

Noise was removed in the images with a combination of a mean (ones filter)
and a Gaussian filter, introduced in Section 4.2. The two filters were first
convolved with each other. Then the resulting filter was applied on the image.
In this way only one convolution of the filter with the image was needed. Again,
the filter sizes were determined by the size of the image.
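
Combining the two filters before touching the image works because convolution
is associative. The sketch below illustrates this with small kernels in plain
Python. The kernel sizes (3 and 5), the value of sigma and all function names
are assumptions chosen for the example, not the settings of the implemented
system.

```python
import math

def convolve_full(a, b):
    """Full two-dimensional convolution of two arrays given as lists of lists."""
    ra, ca, rb, cb = len(a), len(a[0]), len(b), len(b[0])
    out = [[0.0] * (ca + cb - 1) for _ in range(ra + rb - 1)]
    for i in range(ra):
        for j in range(ca):
            for k in range(rb):
                for l in range(cb):
                    out[i + k][j + l] += a[i][j] * b[k][l]
    return out

def mean_kernel(size):
    """Mean (ones) filter, normalized so that the weights sum to one."""
    return [[1.0 / (size * size)] * size for _ in range(size)]

def gauss_kernel(size, sigma):
    """Sampled Gaussian filter, normalized so that the weights sum to one."""
    center = (size - 1) / 2.0
    k = [[math.exp(-((i - center) ** 2 + (j - center) ** 2) / (2 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

# One 7 x 7 kernel that replaces the two separate filtering passes
combined = convolve_full(mean_kernel(3), gauss_kernel(5, 1.0))
```

Filtering an image once with `combined` gives the same result as filtering it
first with the mean filter and then with the Gaussian filter, but the image,
which is by far the largest array involved, is traversed only once.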
6.3 Initial Seed Selection

One of the main issues in this segmentation system was to find good initial
seeds. The initial seeds form a base for the evolving interface stage, discussed
in Section 6.4. Each protein spot in the image should be marked with an initial
seed. If a protein spot is not marked with a seed it will not be regarded as a
protein. Instead it will be part of the background. In Figure 25, some proteins
with corresponding initial seeds are shown.




Figure 25: 2-DE image with initial seeds marking protein spots.

In 2-DE images the protein spots are represented by varying shapes and
sizes of gray level valleys, see Figure 26. This protein spot property is used
to find initial seeds that will mark each protein. Locating all local minima in
the image and marking them as initial seeds is a good starting guess. A local
minimum identification on its own is not a sufficient criterion for protein
spots. Some more protein spot specific characteristics have to be used to
improve the seed guess. The following three properties will be used to identify
initial seeds

1. Local Minimum

2. Curvature

3. Intensity

They are all discussed below, in the following sections.

6.3.1 Locating Local Minima 

One way to locate local minima in images is to look at the gradient and the
Laplacian. A local minimum is defined so that the gradient has a zero crossing
and the Laplacian is negative. This approach works well in theory, but not in
practice. In the discrete images which contain protein spot valleys, the gradient
might never become zero. Also, the Laplacian might not be negative at the
lowest value in the valley. Another approach is needed to find the local minima
in the images.

Figure 26: A 2-DE image with protein spots and its three-dimensional
representation.

Instead, the method used to identify local minima is based on pixel
intensities. A local minimum is defined as a pixel with the intensity value I,
whose eight-connected pixels all have a value greater than or equal to I. In
rare cases a local minimum region might get two or more markers. This happens
when a ridge with higher values pushes up in the middle of the region. This is a
desirable feature which increases the chances of protein detection. If a whole
region is a local minimum, the region only contains one marker. The following
algorithm describes the method.

1. Get Center Pixel and Its Eight-Connected Neighbors: The fringe of the
image is neglected.

2. For Each Neighbor Look at Its Value: The lowest neighbor value is stored
in a variable.

3. Neighbor Has Higher or Equal Value: If the neighbor is marked as a local
minimum, unmark it.

4. All Neighbors Have Higher or Equal Values: Mark the center pixel as a local
minimum.

5. Loop: While not end of image, go to stage 1. (Walk through the image
from right to left and top to bottom.)
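
One possible reading of the five stages is sketched below in Python. The
function name `local_minima` and the list-of-lists image are assumptions, and
the sketch unmarks neighbors only when the center pixel itself qualifies as a
minimum, which is an interpretation of stage 3 rather than a confirmed detail of
the implemented system.

```python
def local_minima(image):
    """Return (row, column) markers for eight-connected local minima."""
    rows, cols = len(image), len(image[0])
    marked = [[False] * cols for _ in range(rows)]
    ring = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]
    for r in range(1, rows - 1):             # top to bottom, fringe neglected
        for c in range(cols - 2, 0, -1):     # right to left
            center = image[r][c]
            if all(image[r + dr][c + dc] >= center for dr, dc in ring):
                # A marked neighbor must have the same value as the center,
                # so unmarking it keeps a single marker on a flat minimum
                for dr, dc in ring:
                    marked[r + dr][c + dc] = False
                marked[r][c] = True
    return [(r, c) for r in range(rows) for c in range(cols) if marked[r][c]]
```

Because the comparison is "greater than or equal", pixels in a perfectly flat
background also qualify as minima. Such markers are the kind of false seeds that
the rejection step in Section 6.3.2 is meant to remove.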

6.3.2 Decision Based Seed Rejection 

When locating local minima it often occurs that the local minima are situated
in a background deviation, such as noise or artifacts in the image. To avoid
treating these local minima as parts of a protein and using them as initial seeds,
some decision based rejection method has to be used. A good way to find out
if a local minimum is situated in a protein spot is to look at the curvature in
the surrounding of the local minimum.

The theory of finding a value of the curvature in an image was derived in
Section 4.3.1, above. The curvature was calculated by finding the Laplacian of
the image f(x, y). A convex curvature is represented with values above zero and
vice versa for the concave curvature.

Thus, a protein spot, which has a concave shape, has negative values in the
surrounding of its local minimum. By defining the size of the surrounding and
the amount of concaveness the protein spot has to show, a more robust initial
seed identification can be done.

In some protein spots, it can occur that the spot is saturated, which means
that the bottom of the spot is cut off. By introducing a third criterion for the
seed identification, this kind of protein spot can also be identified, despite
its non-concaveness. Figure 27 below shows the difference between an inverted
saturated and non-saturated spot.
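
A decision rule of this kind could look like the Python sketch below. Several
details are assumptions made only for illustration: the Laplacian is taken with
a positive center weight so that an intensity valley gives negative values, as
the text states, the curvature is averaged over a square surrounding of the
seed, and saturation is detected by comparing the seed intensity with a fixed
level. The names `keep_seed`, `radius`, `curvature_threshold` and
`saturation_level` are hypothetical.

```python
def laplacian(image, r, c):
    """Four-neighbor discrete Laplacian with center weight +4 (assumed sign)."""
    return (4.0 * image[r][c] - image[r - 1][c] - image[r + 1][c]
            - image[r][c - 1] - image[r][c + 1])

def keep_seed(image, seed, radius, curvature_threshold, saturation_level):
    """Decide whether a local minimum is accepted as an initial seed."""
    r0, c0 = seed
    rows, cols = len(image), len(image[0])
    # Intensity criterion: a saturated spot has a clipped, flat valley floor
    if image[r0][c0] <= saturation_level:
        return True
    # Curvature criterion: mean Laplacian over the surrounding of the seed
    values = [laplacian(image, r, c)
              for r in range(max(1, r0 - radius), min(rows - 1, r0 + radius + 1))
              for c in range(max(1, c0 - radius), min(cols - 1, c0 + radius + 1))]
    return bool(values) and sum(values) / len(values) <= curvature_threshold
```

With a negative `curvature_threshold`, a minimum lying in a flat or only weakly
curved background region is rejected, while a minimum at the bottom of a
pronounced valley is kept.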




Figure 27: Inverted non-saturated (Left) and saturated (Right) spot. 



6.4 Evolving Interface Stage 

When the initial seeds that represent each protein spot have been located, the
evolving interface stage is started. This stage is used to grow a region around
each initial seed. After the region growing process is done, each seed has its
own region associated with it and, thus, a part of the 2-DE image containing
that specific protein type. This is how the segmentation is done.

The theory of evolving interfaces was introduced in Section 5. Evolution
of the interfaces is based on the speed function delivered to the Fast Marching
Method. The initial seeds have great impact on the result, but they have
already been identified and are assumed correct at this stage. What remains is
the definition of a solid speed function and the stopping criteria that will
halt the segmentation.

6.4.1 Defining the Speed Function 

The definition of the speed function used by the Fast Marching Method is of
great importance. Therefore, much effort has been spent on creating and
identifying the best speed function. A good speed function should have the
following properties

1. High values inside protein spots.

2. Low or zero values elsewhere.

The reason to create a speed function with these properties is to be able to
get good stopping criteria for the interface propagation.

To define a good speed function, the nature of the protein spots has to be
well known. Below, in Figure 28, a protein spot is shown in both one and two
dimensions. As already known, the protein spot is represented by an intensity
valley. Looking at the gradient might be a start. In Figure 29 the gradient of
the protein spot in Figure 28 is shown.

Figure 28: Protein spot in one and two dimensions.




Figure 29: Gradient of protein spot in one and two dimensions. 

By combining both the gradient and the intensity values from the image, a
good speed function is obtained. The speed function is defined as

where α is a constant, I_S(x, y) an intensity (I(x, y)) based function and
G_S(x, y) a gradient (G(x, y)) based function. It should be mentioned that this
speed function is only of the type F(I), where I is independent properties,
explained in Section 5. This is the most simple type of speed function and it
does not take the local and global properties into account.

To be able to achieve the above properties for the speed function, I_S(x, y)
and G_S(x, y) are created from each initial seed and summed outwards to the
actual position (x, y). Below, in Figure 30, a schematic view of the creation of
I_S(x, y) and G_S(x, y) is given. In the figure, K denotes Known values and T
denotes Trial values. Non-marked positions are Far values. The speed in T_ij is
given by

$$F(T_{ij}) = e^{-\alpha \left( I_S(T_{ij}) \cdot G_S(T_{ij}) \right)}$$



where



Figure 30: Schematic view over I_S(x, y) and G_S(x, y) creation. K denotes
Known values. T denotes Trial values. Non-marked denotes Far values.
Updating speed value in T_ij.

The idea is that for each step outward from the initial seed position, the
value given is a summation of the mean values from the accepted neighbors to
that point. The following algorithm shows the calculation of I_S(x, y).

1. Get Point (x, y) for Which to Calculate I_S(x, y): This is simply the
position in which the speed is to be calculated.

2. Find Previous Values: Identify all neighbors to the point that are
Known, see Section 5.3.3 for explanation, and take the mean of them.

3. Add Previous Value With Current: Take the mean value from stage 2 and
add it to the value in (x, y).

4. Store New Value: Store and return the value to the surrounding.
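
The four stages can be sketched as a small Python function. The names
`accumulate`, `values`, `summed` and `known` are hypothetical, and the use of
four-connected neighbors is an assumption carried over from the Fast Marching
Method in Section 5.3.3. The same function could be applied to the gradient
image to build G_S(x, y).

```python
def accumulate(values, summed, known, point):
    """Compute the outward-summed value at `point` and store it in `summed`.

    values : image the sum is based on (intensity or gradient)
    summed : already accumulated values, valid where `known` is True
    known  : boolean grid, True for points tagged as Known
    """
    r, c = point                                        # stage 1
    rows, cols = len(values), len(values[0])
    previous = [summed[r + dr][c + dc]                  # stage 2
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
                and known[r + dr][c + dc]]
    mean_previous = sum(previous) / len(previous) if previous else 0.0
    summed[r][c] = values[r][c] + mean_previous         # stages 3 and 4
    return summed[r][c]
```

Since each accepted point adds its own value to the mean of what was already
accumulated behind it, the result grows monotonically along any path leading
away from the seed, as long as the underlying image values are non-negative.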

To improve the properties of the speed function even more, the gradient
image G, on which G_S(x, y) is based, has been manipulated. Imagine a very
small protein spot. The gradient of the spot is not very high, even at the
borders. Around the spot there is only a homogeneous background with hardly
any gradient at all. To acquire a correct speed function in such situations, the
gradient image is multiplied with a factor to raise the gradient in the
background regions.

Figure 31 shows the speed function for the protein spot shown in
Figure 28. As can be seen, the speed function has fulfilled the properties that
were desired.




Figure 31: Derived speed function for a protein spot in one and two dimensions
(α = 1).

6.4.2 Stopping Criteria

The segmentation of the proteins has to stop when all of the proteins have been
isolated. It is important to have good stopping criteria so that a robust and
accurate segmentation is achieved.

Two well functioning stopping criteria in this segmentation system are based
on the time evolution, T(x, y), for the Fast Marching Method. When the proteins
have been isolated, the speed of the evolving interface will decrease sharply and
the time between each evolving step will increase. Thus, the time gradient will
grow enormously and suits perfectly as a stopping criterion. The total consumed
time will also increase rapidly and works as a second stopping criterion.

A combination of these two criteria gives the halting condition for
the segmentation.
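
A combined halting test could be expressed as in the Python sketch below. How
the two criteria are combined is not specified above, so stopping as soon as
either one is exceeded is an assumption, as are the function names and the
threshold parameters `max_time_step` and `max_total_time`.

```python
def should_stop(t_previous, t_current, max_time_step, max_total_time):
    """True when the front has effectively stalled (assumed combination)."""
    time_step = t_current - t_previous    # time gradient between accepted points
    return time_step > max_time_step or t_current > max_total_time

def first_stop_index(arrival_times, max_time_step, max_total_time):
    """Index of the first accepted point at which the marching would halt."""
    for k in range(1, len(arrival_times)):
        if should_stop(arrival_times[k - 1], arrival_times[k],
                       max_time_step, max_total_time):
            return k
    return None
```

Here `arrival_times` stands for the sequence of T values in the order in which
grid points are accepted as Known, which is non-decreasing in the Fast Marching
Method.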

6.5 Data Extraction 

The final stage in this segmentation system is the data extraction stage. Here,
the total intensity, position and curvature of all protein spots are stored in a
text file.

7 Results

The evaluation of the segmentation system is done in three major parts. In the
first part, test images are created and used to identify weaknesses in the
system. Continuing, an evaluation is done with the use of real 2-DE data images.
Finally, this segmentation system is compared with a software on the market.

The aim with this testing is to see if the system is able to handle all
difficulties with 2-DE image segmentation, discussed in Section 2. The result
presented here has not been generated with only one setting of the different
changeable parameters in the segmentation system. The following parameters are
possible to change

Gauss Filter Size: The size of the Gauss filter used to remove noise in the
images.

Mean Filter Size: The size of the mean filter used to remove noise in the
images.

Spot Curvature Threshold: The curvature each local minimum surrounding
has to have, to be regarded as an initial seed for a spot.

Spot Search Space: The size of the surrounding to a local minimum, for which
the curvature is calculated.

Background Filter Size: The image is divided into blocks of this size. Each
block generates a mean value to be used in the background modeling
function. The larger the filter size, the more coarse the background model.

There exists an automatic parameter setting function in the system, but
it was turned off during these tests. Instead, the parameters were manually
optimized. With automatic values, the result would look very much the same.

Some figures in this section contain three-dimensional plots to illustrate
results and intensity images. These plots have been inverted, so that a protein
valley is instead represented by a mountain.

7.1 Testing Environment 

Tests were performed on a PC with a PII 330 MHz processor and 384 MB internal
memory. The environment used for tests and the implemented segmentation
system were developed in Matlab 6.1. Several of the main features in the system
were implemented with the Matlab External Interface and more specifically the C
MEX-files.

7.2 Test Images 

In this part several images were created to test the system's ability to deal
with different difficulties. All the spots presented in the test images have
been created with Gaussian properties according to Eqn. (2), given in
Section 3.3. Below, four different topics are introduced and discussed.

7.2.1 Overlapping Spots

One of the major difficulties in segmenting 2-DE images is the overlapping spot
problem. In Figure 32, two different images are shown with their respective
segmentation results.




Figure 32: Overlapping spots. Left column: Two spots fairly separated. Right
column: Two spots too close.

The result above shows that two spots have to be separated enough that
each spot has a local minimum associated with it. Otherwise, the segmentation
will fail.

7.2.2 Noise in Images 

It often occurs that the 2-DE images contain noise. Therefore, the segmentation
system has to be unaffected by the noise. Below in Figure 33, two different
spots are present in an image with applied noise. The noise is uniformly
distributed on the interval (0.0, 0.1). Observe that the image value varies
between zero and one, which means that 10% noise was added.

The preprocessing noise removal takes care of the noise in a robust and
accurate way. If the noise became too high, typically 40% of the highest
intensity, a correct segmentation was not possible. The reason was that under
these circumstances faint spots drowned in the noise.

Figure 33: Two spots in an image with noise applied.

7.2.3 Images with Background

The background in the 2-DE images sometimes varies strongly and the
segmentation system has to perform well even in such situations. Figure 34,
below, shows an image containing two spots with different size and a
sine-varying background given by

$$\mathrm{background}(x) = A \sin(\varphi x)$$

where A = 0.3, φ = 0.1 and x = 1 ... 60 was used.
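
A test image of this kind can be generated along the lines of the Python sketch
below. Only the background formula, A = 0.3, φ = 0.1, the range x = 1 ... 60 and
the uniform noise interval from Section 7.2.2 come from the text. The bright
base level of 1.0, the isotropic Gaussian valleys and the spot positions, widths
and depths are assumptions for illustration, since Eqn. (2) is not reproduced
here.

```python
import math
import random

def make_test_image(size=60, spots=((20, 20, 4.0, 0.6), (40, 38, 6.0, 0.4)),
                    amplitude=0.3, phi=0.1, noise=0.1, seed=0):
    """Return a size x size image with valleys, background and noise.

    spots : tuples (center x, center y, sigma, depth), all hypothetical
    noise : upper end of the uniform noise interval (0, noise)
    """
    rng = random.Random(seed)
    image = []
    for y in range(1, size + 1):
        row = []
        for x in range(1, size + 1):
            # Sine-varying background on an assumed bright base level
            value = 1.0 + amplitude * math.sin(phi * x)
            # Protein spots are intensity valleys, modeled here as Gaussians
            for cx, cy, sigma, depth in spots:
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                value -= depth * math.exp(-d2 / (2.0 * sigma ** 2))
            value += rng.uniform(0.0, noise)
            row.append(value)
        image.append(row)
    return image
```

Setting `noise=0.0` or `amplitude=0.0` isolates the two disturbances, which
makes it possible to test the noise removal and the background removal
separately.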

By applying the background removal filter, described in Section 6.1, the effect
of the background is minimized and the segmentation system has no problem
with this kind of background. Again, with too much background variation,
the segmentation failed, because faint spots vanished in the background. The
amount of background variation needed for this situation to occur is rarely seen
in real 2-DE images.

7.2.4 Different Sizes and Shapes of Spots

In 2-DE images the protein spots vary in size, shape and intensity. To see if
the segmentation system can handle this diverse range of different spots, a test
image with several spots was created. The result is presented below in Figure 35.

As can be seen from the result, the segmentation system is able to deal with
different spot sizes, intensities, and shapes.




Figure 35: Test image with several different spots.

7.3 Real 2-DE Images

In this part, two real 2-DE images are segmented. The images and the results are
shown in Figure 36.



Figure 36: Two real 2-DE images that have been segmented.

About 320 protein spots were located in each image. All of them seemed to
be correct identifications. In rare cases a protein spot was over-segmented, due
to characteristics in the 2-DE images. An example is the large protein spot in
the left image located in the upper center.

It takes about 30 seconds to do a complete segmentation of a 2-DE image
with a size of 1000 by 1200 pixels. Recall that much of the computation is
done in Matlab, so if this system is implemented in a faster environment the
time to perform a segmentation will be greatly reduced.

7.4 Comparison with Software on the Market 

The segmentation system was compared with a software package on the market.
This software recently won a competition against other leading 2-DE analysis
software packages, which makes it a good competitor.

First, the result from the overlapping spots problem was compared, see
Section 7.2.1. Secondly, the test image with noise and several different spots,
in Section 7.2.4, was put to the test. Finally, the segmentations of two real
2-DE images were compared between the competitor and the implemented system in
this thesis.

7.4.1 Overlapping Spots

The segmentation result is shown below in Figure 37. Both systems produce
the same result.



Figure 37: Two overlapping spots. Left: Competitor. Right: Implemented
system.



7.4.2 Noisy Image with Different Spots

In Figure 38 below, the segmentation result is shown. The competitor is not
able to find two of the spots, which the implemented segmentation system does.




Figure 38: Noisy image containing several different spots. Left: Competitor.
Right: Implemented system.



7.4.3 Real 2-DE Images

In this comparison there is some difference between the results. In both cases
the number of spots identified by the competitor was close to 260. This is
a bit lower than the implemented system, which found around 320 spots. In
Figure 39, the different results are shown.






Figure 39: Segmentation result of two different 2-DE images. Upper: Competitor.
Lower: Implemented system.

7.5 Formulating the methods used

• Method 1.

A method for processing digital image data for two-dimensional gels by
using Fast Marching Methods, comprising the steps of:

- defining initial starting points for the methods;

- generating a speed function for the methods;

- generating interface propagations with said methods;

- defining stopping criteria for the interface propagations for said
methods;

- generating image processing results based on the stopped said evolving
interfaces.

• Method 2. 

The method' as recited in Method 1, wherein tho starting points for said 
methods are generated from the said digital imago 

• Method 3. 

The method as recited in Method 1 or 2, wherein the speed function is 
dependent on said sample image intensities and functions thereof. 

• Method 4. 

The method as recited in Method 1-2 or 3, wherein the speed function is 
dependent on distances to said starting points and functions thereof. 

• Method 5. 

The method as recited in Method 1-3 or 4, wherein the speed function is 
dependent on said evolving interface curvatures and functions thereof. 

• Method 6. 

The method as recited in Method 1-4 or 5, wherein the speed function 
is dependent on said evolving interface normal directions and functions 
thereof. 

• Method 7. 

The method as recited in Method 1-5 or 6, wherein the speed function is 
dependent on said evolving interface positions and functions thereof. 

• Method 8. 

The method as recited in Method 1-8 or 9, wherein the speed function is 
dependent on said evolving interface shapes and functions thereof. 

• Method 9. 

The method as recited in Method 1, wherein the stopping criteria are 
dependent on said evolving interfaces time evolution and functions thereof. 

• Method 10. 

The method as recited in Method 1, wherein the stopping criteria are 
dependent on said evolving interfaces speed evolution and functions thereof. 
• Method 11. 

A method for processing digital image data from two-dimensional 
electrophoresis gels by using Level Set Methods, comprising the steps of: 

- defining initial starting values for the methods; 

- generating a speed function for the methods; 

- generating interface propagations with said methods; 

- defining stopping criteria for the interface propagations for said 
methods; 

- generating image processing results based on the stopped said evolving 
interfaces. 

• Method 12. 

The method as recited in Method 11, wherein the starting points for said 
methods are generated from the said digital image. 

• Method 13. 

The method as recited in Method 11 or 12, wherein the speed function is 
dependent on said sample image intensities and functions thereof. 

• Method 14. 

The method as recited in Method 11-12 or 13, wherein the speed function 
is dependent on distances to said starting points and functions thereof. 

• Method 15. 

The method as recited in Method 11-13 or 14, wherein the speed function 
is dependent on said evolving interface curvatures and functions thereof. 

• Method 16. 

The method as recited in Method 11-14 or 15, wherein the speed function 
is dependent on said evolving interface normal directions and functions 
thereof. 

• Method 17. 

The method as recited in Method 11-15 or 16, wherein the speed function 
is dependent on said evolving interface positions and functions thereof. 

• Method 18. 

The method as recited in Method 11-18 or 19, wherein the speed function 
is dependent on said evolving interface shapes and functions thereof. 

• Method 19. 

The method as recited in Method 11, wherein the stopping criteria are 
dependent on said evolving interfaces time evolution and functions thereof. 

• Method 20. 

The method as recited in Method 11, wherein the stopping criteria are de- 
pendent on said evolving interfaces speed evolution and functions thereof. 
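
The five steps of Method 1 can be illustrated with a small sketch. It is not the implementation evaluated in this thesis: it assumes a unit grid, a first-order upwind update, a strictly positive speed map, and a plain arrival-time threshold as the stopping criterion (in the spirit of Method 9). The names fast_marching, speed, seeds and t_stop are chosen here for illustration only.

```python
import heapq
import math


def fast_marching(speed, seeds, t_stop=math.inf):
    """Propagate interfaces from the seeds by solving |grad T| * F = 1.

    speed  : 2-D list of positive floats (the speed function F)
    seeds  : list of (row, col) starting points, where T = 0
    t_stop : stopping criterion; propagation halts at the first
             arrival time larger than this value
    Returns (T, frozen): arrival times and a mask of accepted pixels.
    """
    rows, cols = len(speed), len(speed[0])
    T = [[math.inf] * cols for _ in range(rows)]
    frozen = [[False] * cols for _ in range(rows)]
    heap = []
    for r, c in seeds:
        T[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))

    def axis_min(r, c, dr, dc):
        # Smallest accepted arrival time among the two neighbours on one axis.
        best = math.inf
        for s in (-1, 1):
            rr, cc = r + s * dr, c + s * dc
            if 0 <= rr < rows and 0 <= cc < cols and frozen[rr][cc]:
                best = min(best, T[rr][cc])
        return best

    def solve(r, c):
        # First-order upwind solution of the discretised Eikonal equation.
        a = axis_min(r, c, 1, 0)
        b = axis_min(r, c, 0, 1)
        a, b = min(a, b), max(a, b)
        h = 1.0 / speed[r][c]
        if b - a >= h:  # only one axis contributes (also when b is inf)
            return a + h
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (b - a) ** 2))

    while heap:
        t, r, c = heapq.heappop(heap)
        if frozen[r][c]:
            continue  # outdated heap entry
        if t > t_stop:
            break  # stopping criterion reached
        frozen[r][c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and not frozen[rr][cc]:
                t_new = solve(rr, cc)
                if t_new < T[rr][cc]:
                    T[rr][cc] = t_new
                    heapq.heappush(heap, (t_new, rr, cc))
    return T, frozen
```

Calling fast_marching(speed, [(2, 2)], t_stop=1.5) on a uniform 5 x 5 speed map accepts only the seed and its four direct neighbours; the frozen mask then plays the role of the segmented region for that starting point.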



7.6 Benefits of Proposed Methods 

The proposed methods are superior to other known methods for segmentation 
of 2-DE images. One of the most successful methods for segmentation of 2-DE 
images is described in Bettens [9]. The new methods described here have a 
better time complexity and are more memory efficient. They are also more easily 
extended to include more complex dependencies by changing the properties of 
the speed function F(L, G, I). It is possible to let the evolving interfaces depend 
on the local curvature of the interface, the global shape of each segmented 
protein and other properties, for achieving good segmentation results, see Section 
5.1. The generality of these methods leaves many possibilities for improving the 
segmentation results. 
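
As an illustration of how such dependencies can be combined, the sketch below builds a speed map from two terms: the image intensity (Method 3) and the distance to the starting point (Method 4). The functional form, the function name make_speed and the parameters i_max, sigma and eps are assumptions made for this example, not the speed function used in the thesis. It also assumes that spots are brighter than the background.

```python
import math


def make_speed(image, seed, i_max=255.0, sigma=20.0, eps=1e-3):
    """Hypothetical speed map F for one starting point.

    F = (intensity / i_max) * exp(-d^2 / (2 * sigma^2)) + eps

    image : 2-D list of grey levels, spots assumed bright
    seed  : (row, col) of the starting point
    eps   : small offset that keeps F strictly positive
    """
    seed_r, seed_c = seed
    speed = []
    for r, row in enumerate(image):
        speed_row = []
        for c, value in enumerate(row):
            # Intensity term: the interface moves fast inside bright spots.
            intensity_term = value / i_max
            # Distance term: the interface slows down far from the marker.
            d2 = (r - seed_r) ** 2 + (c - seed_c) ** 2
            distance_term = math.exp(-d2 / (2.0 * sigma * sigma))
            speed_row.append(intensity_term * distance_term + eps)
        speed.append(speed_row)
    return speed
```

Further properties, such as a curvature term, would enter as additional factors or summands in the same expression, so extending the dependencies only requires changing this one function.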
8 Outlook and Further Improvements 

The implementation of a 2-DE segmentation system seemed to be straightforward 
in the beginning of this master's thesis. But as work advanced, it became clear 
that much effort was needed to achieve acceptable and reliable results. 

Several important aspects of 2-DE image segmentation, such as varying 
background, noise removal and segmentation of overlapping spots, were just briefly 
touched upon and need much more attention. Below, some further improvements 
are suggested. 

Noise Removal: With the help of more sophisticated filters and frequency 
analysis more of the noise effects could be removed. The Level Set Method, 
an evolving interface method, could be used to remove noise while keeping 
the contours of interest intact. It has been used with great success in 
image preprocessing applications, see Sethian [12]. 

Varying Background Removal: By investigating the histogram in different 
parts of the image a better value for the background modulation can be 
achieved. Also, the background modulation could utilise other more accurate 
interpolation methods, for example splines. 

Marker Identification: The marker identification is the key to success in this 
segmentation system and more time should be spent on identifying the markers 
even better. Again, different filters and perhaps spot 
models could be developed. 

Create a More Complex Speed Function: The speed function could be extended 
with curvature information and other properties so that the interface 
propagation could depend on curvature and shape. 

Equilibrium Interface Propagation By allowing speed values to be both 
positive and negative an equilibrium segmentation could be possible. By 
doing so, the stopping criteria would be trivial. 

Adjust Templates to Segmentation Result: By defining certain templates 
that are suitable for protein spots, it might be possible to obtain even 
better results. 
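
The varying-background suggestion above can be made concrete with a small sketch. It takes the most frequent grey level in each tile of the image as the local background value and interpolates between tile centres. Bilinear interpolation is used here as a simpler stand-in for the splines mentioned above, and the names tile_modes, background_surface and tile are chosen for this example only.

```python
from collections import Counter


def tile_modes(image, tile):
    """Most frequent grey level (histogram peak) in each tile x tile block."""
    rows, cols = len(image), len(image[0])
    modes = []
    for r0 in range(0, rows, tile):
        line = []
        for c0 in range(0, cols, tile):
            counts = Counter(
                image[r][c]
                for r in range(r0, min(r0 + tile, rows))
                for c in range(c0, min(c0 + tile, cols))
            )
            line.append(counts.most_common(1)[0][0])
        modes.append(line)
    return modes


def background_surface(image, tile):
    """Bilinear interpolation of the tile modes to every pixel."""
    modes = tile_modes(image, tile)
    rows, cols = len(image), len(image[0])
    n_r, n_c = len(modes), len(modes[0])

    def locate(x, n):
        # Pixel position in tile-centre coordinates, clamped to the grid.
        u = (x + 0.5) / tile - 0.5
        u = min(max(u, 0.0), n - 1.0)
        i = min(int(u), max(n - 2, 0))
        return i, u - i

    surface = []
    for r in range(rows):
        i, fr = locate(r, n_r)
        i2 = min(i + 1, n_r - 1)
        line = []
        for c in range(cols):
            j, fc = locate(c, n_c)
            j2 = min(j + 1, n_c - 1)
            top = modes[i][j] * (1 - fc) + modes[i][j2] * fc
            bottom = modes[i2][j] * (1 - fc) + modes[i2][j2] * fc
            line.append(top * (1 - fr) + bottom * fr)
        surface.append(line)
    return surface
```

Subtracting the returned surface from the image would give a background-corrected image. The tile size has to be clearly larger than a typical spot so that the histogram peak of each tile reflects the background rather than the spots.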
Method for Digital Image Processing 

CLAIMS 

1. A method for processing digital image data for two-dimensional gels by using 
Fast Marching Methods, comprising the steps of: 

- defining initial starting points for the methods; 

- generating a speed function for the methods; 

- generating interface propagations with said methods; 

- defining stopping criteria for the interface propagations for said 
methods; 

- generating image processing results based on the stopped said evolving 
interfaces. 

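2. The method as recited in Claim 1, wherein the starting points for said methods 
are generated from the said digital image. 

3. The method as recited in Claim 1 or 2, wherein the speed function is 
dependent on said sample image intensities and functions thereof. 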

4. The method as recited in Claim 1-2 or 3, wherein the speed function is 
dependent on distances to said starting points and functions thereof. 

5. The method as recited in Claim 1-3 or 4, wherein the speed function is 
dependent on said evolving interface curvatures and functions thereof. 

6. The method as recited in Claim 1-4 or 5, wherein the speed function is 
dependent on said evolving interface normal directions and functions thereof. 

7. The method as recited in Claim 1-5 or 6, wherein the speed function is 
dependent on said evolving interface positions and functions thereof. 

8. The method as recited in Claim 1-8 or 9, wherein the speed function is 
dependent on said evolving interface shapes and functions thereof. 

9. The method as recited in Claim 1, wherein the stopping criteria are dependent 
on said evolving interfaces time evolution and functions thereof. 

10. The method as recited in Claim 1, wherein the stopping criteria are dependent 
on said evolving interfaces speed evolution and functions thereof. 

11. A method for processing digital image data from two-dimensional 
electrophoresis gels by using Level Set Methods, comprising the steps of: 

- defining initial starting values for the methods; 

- generating a speed function for the methods; 

- generating interface propagations with said methods; 

- defining stopping criteria for the interface propagations for said 
methods; 

- generating image processing results based on the stopped said evolving 
interfaces. 
12. The method as recited in Claim 11, wherein the starting points for said 
methods are generated from the said digital image. 

13. The method as recited in Claim 11 or 12, wherein the speed function is 
dependent on said sample image intensities and functions thereof. 

14. The method as recited in Claim 11-12 or 13, wherein the speed function is 
dependent on distances to said starting points and functions thereof. 

15. The method as recited in Claim 11-13 or 14, wherein the speed function is 
dependent on said evolving interface curvatures and functions thereof. 

16. The method as recited in Claim 11-14 or 15, wherein the speed function is 
dependent on said evolving interface normal directions and functions thereof. 

17. The method as recited in Claim 11-15 or 16, wherein the speed function is 
dependent on said evolving interface positions and functions thereof. 

18. The method as recited in Claim 11-18 or 19, wherein the speed function is 
dependent on said evolving interface shapes and functions thereof. 

19. The method as recited in Claim 11, wherein the stopping criteria are 
dependent on said evolving interfaces time evolution and functions thereof. 

20. The method as recited in Claim 11 or 19, wherein the stopping criteria are 
dependent on said evolving interfaces speed evolution and functions thereof. 
Abstract 

Since the genome was sequenced, the importance of proteomics has 
increased enormously. To detect the different protein profiles contained 
in cells and other media, a two-dimensional gel electrophoresis method is 
used. It produces real and digital two-dimensional protein charts, which 
are analysed. One of the first steps in the analysis is to detect the 
proteins in the digital chart. This is done automatically with computer 
assistance. 

In this master's thesis a segmentation system, based on evolving 
interfaces, is defined, implemented and evaluated for the segmentation of 
two-dimensional gel electrophoresis images. It is shown that with the use 
of Fast Marching Methods to approximate the evolving interfaces, the 
segmentation can be swiftly performed with high quality results. One 
weakness of this implementation is that it is a marker based 
segmentation. When two protein spots overlap and lie almost on top of each 
other, it is difficult to find correct markers. Fortunately, occurrences of 
this kind of extreme overlapping are rare. 